New algorithms for learning and pruning oblique decision trees
Authors
Abstract
In this paper, we present methods for learning and pruning oblique decision trees. We propose a new function for evaluating different split rules at each node while growing the decision tree. Unlike the evaluation functions currently used in the literature (which are all based on some notion of purity of a node), this new evaluation function is based on the concept of degree of linear separability. We adopt a correlation-based optimization technique called the Alopex algorithm to find the split rule that optimizes our evaluation function at each node. The algorithm we present here is applicable only to 2-class problems. Through empirical studies, we demonstrate that our algorithm learns good, compact decision trees. We suggest a representation scheme for oblique decision trees that makes explicit the fact that an oblique decision tree represents each class as a union of convex sets bounded by hyperplanes in the feature space. Using this representation, we present a new pruning technique. Unlike other pruning techniques, which generally replace heuristically selected subtrees of the original tree with leaves, our method can radically restructure the decision tree. Through empirical investigation, we demonstrate the effectiveness of our method.
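The geometric fact mentioned in the abstract, that an oblique decision tree represents each class as a union of convex sets bounded by hyperplanes, can be illustrated with a small sketch. This is not the paper's representation scheme or its pruning method, which the abstract does not specify. All names here (`Node`, `Leaf`, `convex_regions`, `example_tree`) are hypothetical, and the convention that the left branch means w·x + b <= 0 is an assumption. Each root-to-leaf path is an intersection of half-spaces, so it is convex; the regions of all leaves carrying the same label together form that class.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Vector = Tuple[float, ...]


@dataclass
class Leaf:
    label: int  # class label (0 or 1 in a two-class problem)


@dataclass
class Node:
    w: Vector      # normal vector of the splitting hyperplane
    b: float       # offset of the splitting hyperplane
    left: "Tree"   # subtree taken when w.x + b <= 0 (assumed convention)
    right: "Tree"  # subtree taken when w.x + b > 0


Tree = Union[Leaf, Node]


def side(w, b, x):
    """Return +1 if x lies on the positive side of the hyperplane, else -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


def classify(tree, x):
    """Follow oblique tests from the root down to a leaf and return its label."""
    while isinstance(tree, Node):
        tree = tree.right if side(tree.w, tree.b, x) > 0 else tree.left
    return tree.label


def convex_regions(tree, label, path=()):
    """List the convex regions of `label`, one per matching leaf.

    Each region is a list of (w, b, s) triples, where s is the required
    side (+1 or -1) of the hyperplane (w, b) on the path to that leaf.
    """
    if isinstance(tree, Leaf):
        return [list(path)] if tree.label == label else []
    return (convex_regions(tree.left, label, path + ((tree.w, tree.b, -1),))
            + convex_regions(tree.right, label, path + ((tree.w, tree.b, 1),)))


def in_class(tree, label, x):
    """Check membership of x in the union of convex regions for `label`."""
    return any(all(side(w, b, x) == s for w, b, s in region)
               for region in convex_regions(tree, label))


# A hypothetical two-feature tree with two oblique splits.
example_tree = Node(
    w=(1.0, 1.0), b=-1.0,
    left=Leaf(0),
    right=Node(w=(1.0, -1.0), b=0.0, left=Leaf(1), right=Leaf(0)),
)
```

In this toy tree, class 1 is a single convex region and class 0 is the union of two. For any point, `in_class(example_tree, classify(example_tree, x), x)` holds, which is the sense in which the tree and the union-of-regions view describe the same classifier.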
Related papers
Pruning Regression Trees with MDL
Pruning is a method for reducing the error and complexity of induced trees. There are several approaches to pruning decision trees, while regression trees have attracted less attention. We propose a method for pruning regression trees based on the sound foundations of the MDL principle. We develop coding schemes for various constructs and models in the leaves and empirically test the new method...
Decision Forests with Oblique Decision Trees
Ensemble learning schemes have shown impressive increases in prediction accuracy over single model schemes. We introduce a new decision forest learning scheme, whose base learners are Minimum Message Length (MML) oblique decision trees. Unlike other tree inference algorithms, MML oblique decision tree learning does not over-grow the inferred trees. The resultant trees thus tend to be shallow and ...
C5.1.3 Decision Tree Discovery
We describe the two most commonly used systems for induction of decision trees for classification: C4.5 and CART. We highlight the methods and different decisions made in each system with respect to splitting criteria, pruning, noise handling, and other differentiating features. We describe how rules can be derived from decision trees and point to some difference in the induction of regression tree...
Pruning Decision Trees and Lists
Machine learning algorithms are techniques that automatically build models describing the structure at the heart of a set of data. Ideally, such models can be used to predict properties of future data points and people can use them to analyze the domain from which the data originates. Decision trees and lists are potentially powerful predictors and embody an explicit representation of the struc...
A framework for bottom-up induction of oblique decision trees
Decision-tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes...
Journal:
- IEEE Trans. Systems, Man, and Cybernetics, Part C
Volume 29
Publication date: 1999